Tight bound on the maximum number of shortest unique substrings
Abstract
A substring Q of a string S is called a shortest unique substring (SUS) for position p in S if Q occurs exactly once in S, this occurrence of Q contains position p, and every substring of S that contains position p and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query position p all the SUSs for position p can be reported quickly. There exists an optimal O(n)-time preprocessing scheme that answers queries in optimal O(k) time, where n is the length of S and k is the number of SUSs for the query position. In this paper, we reveal structural, combinatorial properties underlying this problem: namely, we show that the number of intervals in S that correspond to SUSs for all positions in S is less than 1.5n. We also show that this bound is tight, that is, the upper and lower bounds match.
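The definition above can be made concrete with a brute-force sketch. This is not the O(n)-time scheme the abstract refers to; it is a naive illustration, and the function names (`count_occurrences`, `sus_for_position`, `all_sus_intervals`) and the 0-based inclusive interval convention are assumptions introduced here.

```python
def count_occurrences(s, q):
    """Count (possibly overlapping) occurrences of q in s."""
    return sum(1 for i in range(len(s) - len(q) + 1) if s.startswith(q, i))


def sus_for_position(s, p):
    """Return all SUS intervals (start, end) for position p.

    Intervals are 0-based and inclusive (an assumed convention).
    Lengths are tried in increasing order, so the first length that
    yields a unique substring covering p is the shortest one.
    """
    n = len(s)
    for length in range(1, n + 1):
        found = []
        # A window of this length covers p iff start <= p <= start + length - 1.
        first = max(0, p - length + 1)
        last = min(p, n - length)
        for start in range(first, last + 1):
            if count_occurrences(s, s[start:start + length]) == 1:
                found.append((start, start + length - 1))
        if found:
            return found
    return []  # only reached for an empty string


def all_sus_intervals(s):
    """Collect the distinct intervals that are a SUS for at least one position."""
    intervals = set()
    for p in range(len(s)):
        intervals.update(sus_for_position(s, p))
    return intervals


if __name__ == "__main__":
    s = "aab"
    print(sus_for_position(s, 1))      # [(0, 1), (1, 2)]
    print(len(all_sus_intervals(s)))   # 3, which is below 1.5 * len(s) = 4.5
```

For "aab", position 1 has two SUSs ("aa" and "ab"), since the single character "a" occurs twice. Counting distinct intervals over all positions gives the quantity that the abstract bounds by 1.5n.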
Related articles
Hypo-efficient domination and hypo-unique domination
For a graph $G$, let $\gamma(G)$ be its domination number. We define a graph $G$ to be (i) a hypo-efficient domination graph (or a hypo-$\mathcal{ED}$ graph) if $G$ has no efficient dominating set (EDS) but every graph formed by removing a single vertex from $G$ has at least one EDS, and (ii) a hypo-unique domination graph (a hypo-$\mathcal{UD}$ graph) if $G$ has at least two minimum dominating sets...
Tight Bounds on the Maximum Number of Shortest Unique Substrings
A substring Q of a string S is called a shortest unique substring (SUS) for interval [s, t] in S, if Q occurs exactly once in S, this occurrence of Q contains interval [s, t], and every substring of S which contains interval [s, t] and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query interval [s, t] all the SUSs...
Worst Case Bounds for Shortest Path Interval Routing
Consider shortest path interval routing, a popular memory-balanced method for solving the routing problem on arbitrary networks. Given a network G, let Irs(G) denote the maximum number of intervals necessary to encode groups of destinations on an edge, minimized over all shortest path interval routing schemes on G. In this paper, we establish tight worst case bounds on Irs(G). More precisely, for any n w...
Minimum Unique Substrings and Maximum Repeats
Unique substrings appear scattered in the stringology literature and have important applications in bioinformatics. In this paper we initiate a study of minimum unique substrings in a given string; that is, substrings that occur exactly once while all their substrings are repeats. We discover a strong duality between minimum unique substrings and maximum repeats which, in particular, allows fas...
A Genetic Algorithm for the Shortest Common Superstring Problem
Many real-world problems can be modeled as the shortest common superstring problem. This problem has several important applications in areas such as DNA sequencing and data compression. The shortest common superstring problem (SCS) can be formulated as follows. Given a set of strings S = {s1, s2, ..., sn}, the goal is to find the shortest string N∗ such that each si ∈ S is a substring of N∗. Findin...
Journal title:
- CoRR
Volume abs/1609.07220
Publication date 2016